Abstract:


Bradford’s law of bibliographic scattering is a fundamental principle in bibliometrics, offering 
valuable guidance for academic libraries in literature search and procurement. However, Bradford 
curves can exhibit various shapes over time, and predicting these shapes remains a challenge due to 
a lack of causal explanation. This paper attributes the deviations from the theoretical J-shape to 
integer constraints on the number of journals and articles, extending Leimkuhler and Egghe’s 
formulas to encompass highly productive core journals, where the theoretical journal number falls 
below one. Using the Simon-Yule model, key parameters of the extended formulas are identified 
and analyzed. The paper explains the reasons for the Groos Droop and examines the critical points 
for shape changes. The proposed formulas are validated with empirical data from literature, 
demonstrating that this method can effectively predict the evolution of Bradford curves, thereby
aiding academic libraries in the procurement and utilization of scientific literature.
1 Introduction

1.1 Introduction 


As one of the three fundamental laws of bibliometrics, Bradford’s law of bibliographic 
scattering has numerous potential applications in academic libraries. For instance, it can help 
identify the core journals or publishers in a specific research area, guiding librarians in journal and 
book procurement decisions (Barrantes et al., 2023). Additionally, it can assist readers and librarians 
in literature searches by quickly pinpointing key WoS research areas or IPC classes related to a topic 
(Sheikh et al., 2022). However, preparing Bradford curves can be time-consuming, particularly for 
journals with only one or two relevant papers. Moreover, since the scientific literature in a certain 
discipline or research area often increases exponentially or passes through various developmental 
stages (Larivière et al., 2008), a Bradford curve prepared at any given time may not be directly 
applicable many years later without adjustment. Some mathematical formulas can predict the shape
of the Bradford curve, but they typically yield a J-shaped curve. In practice, the Bradford curve can
take at least six different shapes, most notably the S-shaped curve with the so-called Groos Droop
(Groos, 1967). Causal explanations for this bibliometric law and comprehensive empirical
examples are lacking (Wagner-Döbler, 1997). Therefore, predicting the evolution of the Bradford
curve remains an open question that warrants further investigation.


This paper attributes the different shapes of the Bradford curve to the integer constraints on journal
and paper numbers. If the journal productivity $n$ is high enough that the corresponding theoretical
journal number $f_t(n) = C/n^{\alpha}$ falls below one, the actual journal number $f(n)$ can only be
zero or one. Consequently, the discrete nature of journal numbers causes the core zone to deviate from
the theoretical results of the Lotka or Simon-Yule model. To address this issue, this paper proposes
two different formulas for the core zone and the normal zone. Key parameters of these formulas are
identified and studied through theoretical analysis and Monte Carlo simulation of the Simon-Yule
model. The causes of the Groos Droop are explained, and the critical points for shape changes are
examined. Finally, the proposed formulas are validated using empirical data from the literature. The
findings suggest that the proposed method can predict the evolution of Bradford curves, thereby
guiding academic libraries in the procurement and utilization of scientific literature.
1.2 Literature Review 


Bradford’s law was first proposed by Bradford in 1934 (Bradford, 1934) but did not gain wide 
recognition until Vickery further developed the theory in 1948 (Vickery, 1948). According to 
Bradford's law, if journals are arranged in descending order of productivity and divided into $p$
groups with the same number of papers, the numbers of journals in the groups, $n_i$, follow the ratio
$n_1 : n_2 : \cdots : n_p = 1 : k : \cdots : k^{p-1}$, where $k$ is the Bradford multiplier. Besides this
verbal form, Bradford's law can also be depicted as a J-shaped curve by plotting the accumulated
productivity $R(r)$ of the first $r$ journals against the natural logarithm of the journal rank $r$.
Leimkuhler proposed the mathematical formula for Bradford's curve in 1967 (Leimkuhler, 1967), and
Egghe developed a method for determining the parameters of this formula in 1990 (Egghe, 1990). In
Egghe's formula, $R(r) = a \log(1 + br)$, where the key parameters $a$ and $b$ can be calculated
from the article number $A$, the journal number $T$, and the productivity $y_m$ of the most productive
journal. Although Egghe's formula matches many bibliographies well, it corresponds to a J-shaped
curve, which deviates from those with a Groos Droop (Egghe, 1990).


Incomplete bibliographies were initially believed to cause the Groos Droop, but further
research refuted this hypothesis (Qiu & Tague, 1990). Egghe demonstrated that if the rank $r$ of
each journal is transformed into $r' = r + r_0$ by adding a large constant $r_0 > 1/b$, the new
curve becomes concave downward, showing a Groos Droop (Egghe & Rousseau, 1988). The merging
of different bibliographies, each with a different maximum journal productivity $y_0$, could explain
the large constant $r_0$ (Egghe & Rousseau, 1988). However, it is also likely that the large core
regions (regions containing the most productive journals, where $f_t(n) < 1$) of some bibliographies
contribute to the large $r_0$ (Chen & Leimkuhler, 1987). Essentially, $y_m$ in Egghe's formula denotes
the journal productivity at which $f_t(y_m) = C/y_m^{\alpha} \approx 1$ (Egghe, 1985), rather than the
maximum yield $X_1$ of a journal as claimed by Egghe himself. Thus, if the total number of journals in
the core region, $T_0$, exceeds the critical value $r_0 = 1/b$, a Groos Droop will emerge. This paper
adopts this explanation and extends Egghe's formula to predict the evolution of Bradford's curves.


In the 1990s, research interest in Bradford's law shifted from the static presentation of data at
a particular time to its dynamic and evolutionary aspects (Oluić-Vuković, 1998). Oluić-Vuković
studied how the increase in productivity of core journals affected the shape of the distribution curve
over time (Oluić-Vuković, 1989). By analyzing the research output of Croatian scholars in different
subjects, she concluded that the Groos Droop or S-shaped curve is caused by an increase in the
concentration/dispersal disparity, reflected by the rise in the core/periphery ratio (Oluić-Vuković,
1991). The dynamic evolution of Bradford curves and the emergence of the Groos Droop are
presented in her 1992 study (Oluić-Vuković, 1992), and similar empirical studies partitioning
bibliographies over time were conducted by Garg (Garg et al., 1993), Wagner-Döbler
(Wagner-Döbler, 1997) and Sen (Sen & Chatterjee, 1998).


Meanwhile, stochastic models like the Simon-Yule model have increasingly been used to study
the dynamic characteristics of bibliometric laws (Oluić-Vuković, 1997, 1998). Initially introduced
by Yule in 1924 for studying the distribution of biological genera by species number, the Simon-Yule
model gained recognition when Simon expanded it in 1955 to analyze the frequency
distributions of words in writing samples (Simon, 1955). Besides theoretical methods that solve
the model exactly for a constant entry rate $\alpha$ of new sources (Simon, 1955), Monte Carlo
simulations have been used to explore more complex scenarios, such as declining entry rates $\alpha_t$
(Simon & Van Wormer, 1963) and autocorrelated growth rates $\gamma$ of established journals (also
referred to as the aging or obsolescence rate) (Ijiri & Simon, 1977). Chen et al. (Chen, 1989; Chen et
al., 1994; Chen et al., 1995) first used the Simon-Yule model to study the evolution of Lotka's and
Bradford's laws over time. They found that the entry rate $\alpha$ and the autocorrelated growth rate
$\gamma$ have significant yet opposite effects on the Bradford curves, offering an explanation for the
various types of Bradford curves (Chen et al., 1995).


Later, Oluić-Vuković also explored the dynamics of the Bradford distribution using the Simon-Yule
model but found its steady-state solution too restricted to handle time variations, limiting its
applicability (Oluić-Vuković, 1997, 1998). This paper also uses the Simon-Yule model to
examine the effects of different scenarios on the key parameters of the extended Egghe's formula
(e.g., the journal number $T_0$, article number $A_0$ and maximum productivity $X_1$ of the core
region). However, the model is not used directly to forecast the evolution of Bradford curves or to
compare them with empirical data. Instead, key parameters are estimated from past empirical data to
improve predictions of future Bradford curve evolution.


2 Theoretical Study 
2.1 Simon-Yule Model 


Simon's generating mechanism for the Bradford distribution is based on the following two
assumptions, where $f(n,t)$ denotes the number of journals that have published exactly $n$
papers among the first $t$ published papers.


Assumption I: There is a constant probability $\alpha$ that the $(t+1)$-th paper is published in a
new journal, that is, a journal that has not published any of the first $t$ papers;


Assumption II: The probability that the $(t+1)$-th paper is published in a journal that has
published $n$ papers is proportional to $n f(n,t)$, that is, to the total number of papers of all
journals that have published exactly $n$ papers.


Therefore, if there are $A$ papers at a given time, the corresponding journal number $T$ is
approximately $T = \alpha A$. Based on Simon's two assumptions, the steady-state solution of the
Bradford distribution can be written as (Chen, 1989):


$$ f_t(n) = \rho B(n, \rho + 1) \approx \rho\,\Gamma(\rho + 1)\, n^{-(\rho + 1)} \qquad (1) $$


where $B$ is the beta function, $\Gamma$ is the gamma function, and $\rho$ is a function of the entry
rate of new journals $\alpha$, defined as $\rho = 1/(1 - \alpha)$. Equation (1) suggests that the
analytical outcome of the Simon-Yule model aligns with Lotka's law when $\rho \approx 1$.


In addition to the analytical solutions, Monte Carlo simulations are conducted for $\alpha = 0.15$,
and the results are compared with the theoretical results of Equation (1), as shown in Figure 1.
Detailed procedures for these simulations can be found in (Simon & Van Wormer, 1963) and are
thus omitted here. To reduce inherent randomness, each case is simulated $N = 10^4$ times, and
only the medians of these simulations are used as the final outputs.
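
As a minimal sketch of one simulation run under Assumptions I and II (not the exact procedure of Simon & Van Wormer, 1963), the process could be written in Python as follows. The function names and the seed are illustrative assumptions. Picking a uniformly random earlier paper and reusing its journal is one way to select a journal with probability proportional to its current paper count.

```python
import random
from collections import Counter

def simulate_simon_yule(num_papers, alpha, seed=None):
    """Return the paper count of every journal after num_papers papers."""
    rng = random.Random(seed)
    paper_journal = [0]   # journal index of each paper; paper 1 founds journal 0
    counts = [1]          # counts[j] = papers published so far by journal j
    for _ in range(num_papers - 1):
        if rng.random() < alpha:
            # Assumption I: the next paper appears in a new journal.
            counts.append(1)
            paper_journal.append(len(counts) - 1)
        else:
            # Assumption II: a uniformly chosen earlier paper belongs to a given
            # journal with probability proportional to that journal's size.
            j = paper_journal[rng.randrange(len(paper_journal))]
            counts[j] += 1
            paper_journal.append(j)
    return counts

def frequency_table(counts):
    """Map productivity n to f(n), the number of journals with exactly n papers."""
    return dict(Counter(counts))

counts = simulate_simon_yule(10_000, 0.15, seed=1)
f = frequency_table(counts)
```

Repeating such runs and taking medians of $f(n)$ would mirror the averaging described above; the number of journals per run should be close to $\alpha A$.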


Figure 1 illustrates two distinct zones in the simulation results: a normal zone (blue circles) and a
core zone (red squares). This distinction arises because actual journal numbers $f(n)$ must be
integers and cannot take fractional values below one. Thus, when the journal productivity $n$ is high
enough for the theoretical journal number $f_t(n)$ to drop below one, actual journal numbers $f(n)$
are constrained to zero or one, deviating from theoretical predictions as shown by the red squares and
black dots in Figure 1. Moreover, despite the smaller number of journals in the core region, their
contribution to the paper count is significant, as shown in Figure 1(b). Hence, accurate prediction of
the paper count of each journal $X_r$ in the core zone is crucial for depicting the Bradford curve
faithfully.

Figure 1 Comparisons of the theoretical and numerical results: (a) number of journals $f(n)$ with
productivity $n$; (b) number of papers $nf(n)$ produced by journals with productivity $n$.


Estimating the journal productivity $X_r$ in the core region first requires the journal count $T_0$
and paper count $A_0$. Since Figure 1 shows a close match between the simulated and theoretical
journal number $f(n)$ and paper number $nf(n)$ in the normal region, the total journal count $T_1$
and total paper count $A_1$ of the normal region can be obtained directly by summing all journal and
paper numbers: $T_1 = \sum_{n=1}^{y_m} f(n)$ and $A_1 = \sum_{n=1}^{y_m} n f(n)$, where $y_m$ is
the journal productivity at which $f_t(y_m) = 1$. In this context, $y_m$ can be seen as the productivity
of the most productive journal in the normal region, as indicated by the black diamond in Figure 1.
According to Equation (1), the analytical expression for $y_m$ can be derived as:


$$ y_m = \left[ A(\rho - 1)\,\Gamma(\rho + 1) \right]^{\frac{1}{\rho + 1}} \qquad (2) $$


Once $y_m$ is calculated, the total journal count $T_0$ and paper count $A_0$ of the core region can
be calculated as $T_0 = T - T_1$ and $A_0 = A - A_1$. Alternatively, they can be calculated directly as:


$$ T_0 \approx \int_{y_m}^{+\infty} T f_t(n)\, dn = \frac{y_m}{\rho} \qquad (3) $$


$$ A_0 \approx \int_{y_m}^{+\infty} T n f_t(n)\, dn = \frac{y_m^2}{\rho - 1} \qquad (4) $$


where $y_m$ is calculated using Equation (2).
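
The closed forms of Equations (2) to (4) are easy to evaluate numerically. The sketch below assumes that Equation (2) follows from setting $T f_t(y_m) = 1$ with $T = \alpha A$ and the power-law form of Equation (1), and that the integrals in Equations (3) and (4) evaluate to $y_m/\rho$ and $y_m^2/(\rho - 1)$. The function name `core_parameters` is illustrative.

```python
from math import gamma

def core_parameters(A, alpha):
    """Return (rho, y_m, T0, A0) for A papers and a constant entry rate alpha."""
    rho = 1.0 / (1.0 - alpha)
    # Eq. (2): productivity at which the theoretical journal number equals one.
    y_m = (A * (rho - 1.0) * gamma(rho + 1.0)) ** (1.0 / (rho + 1.0))
    T0 = y_m / rho                 # Eq. (3): journals in the core region
    A0 = y_m ** 2 / (rho - 1.0)    # Eq. (4): papers in the core region
    return rho, y_m, T0, A0

rho, y_m, T0, A0 = core_parameters(10_000, 0.15)
```

Because the integrals replace discrete sums, the resulting $T_0$ and $A_0$ are continuous approximations rather than integer counts.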

Figure 2 Journal productivity in the core region $X_r$ as a function of journal rank $r$: (a) journal
productivity $X_r$ as a function of $r$; (b) journal productivity ratio $X_1/X_r$ as a function of $r$.


In the Simon-Yule model with constant entry rate $\alpha$, the number of papers of the $r$-th most
productive journal, $X_r$, can be estimated using Gumbel's $r$-th characteristic extreme theory
(Glänzel, 2010, 2013):


$$ G(X_r) \approx \int_{X_r}^{+\infty} f_t(n)\, dn = \frac{r}{A} \qquad (5) $$


By solving this equation for $r = 1$, the productivity of the most productive journal, $X_1$, can be
expressed as:


$$ X_1 = \left[ A\,\Gamma(\rho + 1) \right]^{\frac{1}{\rho}} = (\rho - 1)^{-\frac{1}{\rho}}\, y_m^{\frac{\rho + 1}{\rho}} \qquad (6) $$


The productivity of the $r$-th most productive journal, $X_r$, is related to $X_1$ by
$X_r = X_1 r^{-1/\rho}$. Figure 2 compares Gumbel's $r$-th characteristic extreme values with the
medians of the simulation results. While Gumbel's theory effectively predicts the largest paper number
$X_1$, it falls short in estimating the other paper numbers in the core zone, $X_r$ for
$r = 2, 3, \ldots, T_0$. Thus, an alternative method is adopted in this paper, in which all other $X_r$,
for $r = 2, 3, \ldots, T_0$, are related to the largest paper number $X_1$ through the equation:


$$ \frac{X_1}{X_r} = k(r - 1) + 1 \qquad (7) $$

where k is the only parameter waiting to be determined. This equation can be derived from the 
Equation (8) of Reference (Chen, 1989), by assuming (7%; —1)/(4 +b) and —1/c are 
relatively small. The validity of this equation is also supported by Figure 2(b), where the blue circles 
represent the simulation results, and the blue dashed lines show the linear fitting results. Hence, the 
productivity of the $r$-th most productive journal can be derived from Equation (7), and the
cumulative productivity of the first $r$ most productive journals can be written as:


$$ R_c(r) = \sum_{i=1}^{r} \frac{X_1}{k(i - 1) + 1} \qquad (8) $$


If there are $T_0$ journals with $A_0$ papers in the core region and the values of $T_0$ and $A_0$
are known, then the parameter $k$ can be calculated from the equation $R_c(T_0) = A_0$. Then,
Equation (8) can be used to predict the evolution of the core region ($r \le T_0$) of Bradford's curves.
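
The condition $R_c(T_0) = A_0$ has no closed-form solution for $k$, but since $R_c(T_0)$ decreases monotonically in $k$, a simple bisection suffices. The following is a sketch under the assumption that $T_0$ is rounded to an integer and that Equation (8) is the finite sum of $X_1/[k(i-1)+1]$; the function names are illustrative.

```python
def core_cumulative(r, X1, k):
    """Eq. (8): cumulative productivity of the r most productive journals."""
    return sum(X1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))

def solve_k(T0, A0, X1, tol=1e-10):
    """Find k >= 0 with core_cumulative(T0, X1, k) == A0 by bisection."""
    if T0 < 2:
        raise ValueError("k is only identifiable when the core has at least two journals")
    if not (X1 < A0 <= T0 * X1):
        raise ValueError("a solution with k >= 0 requires X1 < A0 <= T0 * X1")
    lo, hi = 0.0, 1.0
    while core_cumulative(T0, X1, hi) > A0:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if core_cumulative(T0, X1, mid) > A0:
            lo = mid    # cumulative total still too large, so k must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip on synthetic numbers (hypothetical, not taken from the paper).
A0_demo = core_cumulative(20, 500.0, 0.8)
k_demo = solve_k(20, A0_demo, 500.0)
```

The bounds check reflects the fact that $R_c(T_0)$ ranges from $T_0 X_1$ at $k = 0$ down towards $X_1$ as $k$ grows.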


2.2 Egghe’s formula 


After removing the $T_0$ journals and $A_0$ papers of the core region, the remaining $T_1$ journals
and $A_1$ papers align well with the theoretical results predicted by Equation (1). Consequently,
they follow Lotka's law, and their Bradford curve can be predicted using the revised Leimkuhler
and Egghe formula (Egghe, 1990):


$$ R(r_1) = a \log(1 + b r_1) \qquad (9) $$


where the key parameters $a$ and $b$ are defined as:


$$ a = \frac{A_1}{\log(e^{\gamma} y_m)} \qquad (10) $$

$$ b = \frac{e^{\gamma} y_m - 1}{T_1} \qquad (11) $$


where $\gamma$ is Euler's constant, $\gamma \approx 0.5772$, and $y_m$ is the journal productivity at
which the corresponding theoretical journal number $f_t(y_m) \approx 1$. While $y_m$ can be calculated
directly from Equation (2), it can also be estimated using the following equation if the values of
$X_1$, $T_0$ and $A_0$ are known:


[Equation (12), which expresses $y_m$ in terms of $X_1$, $T_0$ and $A_0$, is illegible in the source.]


Since journal productivity in the core region is higher than in the normal region, the core journals
occupy the first $T_0$ ranks. Consequently, the Bradford curve for the normal region starts at the
point $(T_0, A_0)$. Each rank $r_1$ in the normal region should be transformed to $r = r_1 + T_0$, and
the cumulative productivity of the first $r$ journals should be transformed into
$R_n(r) = R(r_1) + A_0$. The revised Egghe's formula for the normal region is then written as:

$$ R_n(r) = R(r - T_0) + A_0 = a \log[1 + b(r - T_0)] + A_0 \qquad (13) $$


Equation (13) can be used to predict the dynamic evolution of the normal region ($T_0 < r \le T$)
of the Bradford curve. Together, Equations (8) and (13) can therefore be used to predict the
dynamic evolution of the whole Bradford curve.
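
A compact way to see how the two formulas fit together is to evaluate them as one piecewise function. The sketch below assumes Equation (8) is the finite sum over the core ranks and uses Equations (10), (11) and (13) for the normal region; all numerical inputs are hypothetical and only illustrate the call pattern.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329

def egghe_parameters(A1, T1, y_m):
    """Eqs. (10)-(11): parameters a and b for the normal region."""
    a = A1 / log(exp(EULER_GAMMA) * y_m)
    b = (exp(EULER_GAMMA) * y_m - 1.0) / T1
    return a, b

def bradford_curve(r, T0, A0, X1, k, a, b):
    """Cumulative productivity R(r): Eq. (8) for r <= T0, Eq. (13) beyond."""
    if r <= T0:
        return sum(X1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))
    return a * log(1.0 + b * (r - T0)) + A0

# Illustrative (hypothetical) inputs, not values taken from the paper.
T, A = 1500, 10_000
T0, A0, X1, k, y_m = 27, 5900.0, 2700.0, 1.5, 32.0
a, b = egghe_parameters(A - A0, T - T0, y_m)
total = bradford_curve(T, T0, A0, X1, k, a, b)   # equals A up to rounding
```

The last line recovers $A$ because $a \log(1 + b T_1) = A_1$ by construction of $a$ and $b$, so the normal-region branch always ends at the point $(T, A)$.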

Figure 3 The evolution of the Bradford curve and the cause of the Groos Droop: (a) the evolution of
the Bradford curve; (b) the cause of the Groos Droop.


Figure 3(a) displays the Bradford curve. The blue circles represent the normal zone, while the 
red squares indicate the core zone. The blue dashed lines show the prediction results of Equation 
(13), and the red dotted lines show the prediction results of Equation (8). The black upper triangle,
diamond and lower triangle represent the points $(1, X_1)$, $(T_0, A_0)$ and $(T, A)$, respectively.
From the discussion above, it is evident that these three points and two lines are crucial for predicting 
the evolution of the Bradford curve. 


2.3 Groos Droop 


Groos was the first to observe that in some datasets, when the journal productivity is low, the
Bradford curve tends to bend downward (Groos, 1967). Egghe (Egghe & Rousseau, 1988) explained
the Groos Droop as a consequence of merging datasets. However, this section demonstrates that the
existence of the core region causes the Groos Droop in the normal region.


The first and second derivatives of $R_c(r)$ for the core region can be derived from Equation
(8) by treating $r$ as a continuous variable:

$$ \frac{\partial R_c(r)}{\partial(\log r)} = \frac{X_1 r}{k(r - 1) + 1} \qquad (14) $$

$$ \frac{\partial^2 R_c(r)}{\partial(\log r)^2} = \frac{X_1 (1 - k)\, r}{[k(r - 1) + 1]^2} \qquad (15) $$


From Equation (15), it can be noted that when $k > 1$, $\partial^2 R_c(r)/\partial(\log r)^2 < 0$, so the
Bradford curve for the core region is concave downward. Conversely, when $k < 1$,
$\partial^2 R_c(r)/\partial(\log r)^2 > 0$, so the curve is concave upward. As the entry rate of new
journals $\alpha$ increases, the number of journals $T$ rises and the distribution of articles becomes
more dispersed, leading to a decrease in the largest journal productivity $X_1$. From Equation (8) and
$R_c(T_0) = A_0$, a lower $X_1$ results in a lower $k$ if $A_0$ is relatively constant. Therefore, as
$\alpha$ increases, the Bradford curve for the core region gradually becomes concave upward, as shown
in Figure 3.


Similarly, the first and second derivatives of $R_n(r)$ for the normal region can be derived from
Equation (13):


$$ \frac{\partial R_n(r)}{\partial(\log r)} = \frac{a b r}{b(r - T_0) + 1} \qquad (16) $$

$$ \frac{\partial^2 R_n(r)}{\partial(\log r)^2} = \frac{a b (1 - b T_0)\, r}{[b(r - T_0) + 1]^2} \qquad (17) $$


From Equation (17), it can be noted that when $T_0 > 1/b$,
$\partial^2 R_n(r)/\partial(\log r)^2 < 0$, and the Bradford curve for the normal region is concave
downward, showing a Groos Droop. When $T_0 < 1/b$, $\partial^2 R_n(r)/\partial(\log r)^2 > 0$, and the
curve is concave upward, forming a J-shaped curve. As the entry rate of new journals $\alpha$
increases, Figure 3(a) shows that $T_0$ eventually falls below $1/b$, causing the Bradford curve for
the normal region to become concave upward, similar to the core region.


Figure 3(b) illustrates the variation of the key parameters $T_0$, $1/b$ and $k$ with the entry rate
$\alpha$. When $A = 10^4$, the normal region starts to become concave upward at the critical point
$\alpha_n \approx 0.2$, while the core region does so at $\alpha_c \approx 0.3$. Therefore, when
$\alpha < 0.2$, the entire Bradford curve is concave downward; when $\alpha > 0.3$, it is concave
upward. For $0.2 < \alpha < 0.3$, the Bradford curve exhibits a reversed S shape, with the core region
concave downward and the normal region concave upward. Figure 3(a) shows these three shapes of
Bradford curves. In this specific case, since $\alpha_n < \alpha_c$, there is no S-shaped Bradford
curve. This is because the aging of journals is not considered here, making the largest journal
productivity $X_1$ relatively large. When the aging effects are considered, as discussed in detail in
Section 3.2, $X_1$ decreases significantly, which results in a lower $k$ and thus a much lower
$\alpha_c$. If $\alpha_c < \alpha_n$, an S-shaped Bradford curve will appear for
$\alpha_c < \alpha < \alpha_n$, with the core region concave upward and the normal region concave
downward.
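
The sign conditions of Equations (15) and (17) can be combined into a small decision rule. The sketch below only encodes those two inequalities ($k$ versus 1 for the core region, $T_0$ versus $1/b$ for the normal region); the function name and the returned labels are illustrative.

```python
def classify_shape(k, T0, b):
    """Classify the Bradford curve from the signs of Eqs. (15) and (17)."""
    core_down = k > 1.0            # Eq. (15) negative: core concave downward
    normal_down = T0 > 1.0 / b     # Eq. (17) negative: normal region concave downward
    if core_down and normal_down:
        return "concave downward throughout"
    if not core_down and not normal_down:
        return "concave upward throughout (J-shape)"
    if core_down:
        return "reversed S-shape"  # core downward, normal region upward
    return "S-shape"               # core upward, normal region downward
```

For example, `classify_shape(0.5, 40, 0.05)` falls in the last branch, since $k < 1$ while $T_0 = 40$ exceeds $1/b = 20$.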


2.4 Bradford Dynamics 


Given the analytical expressions of $T_0$, $A_0$ and $X_1$ (Equations (3), (4), and (6)), the core
region of the Bradford curve at any time can be predicted using Equation (8). The parameters for
the normal region can be derived from these through $T_1 = T - T_0$, $A_1 = A - A_0$ and
Equation (12), allowing the normal region of the Bradford curve to be predicted using Equation
(13).


The Bradford curves for the constant entry rate scenario are shown in Figure 4(a). Here, the
red squares and blue circles represent the simulation results for the core and normal regions,
respectively, while the red dashed lines and blue dotted lines represent the theoretical results of
Equations (8) and (13). The key points $(1, X_1)$, $(T_0, A_0)$ and $(T, A)$ are also shown as a black
upper triangle, diamond and lower triangle in Figure 4(a). Notably, although the core region contains
far fewer journals than the normal region, it occupies a significant part of the plot because of the
logarithmic scale of the x-axis. Consequently, the most productive journals, which have the smallest
rank numbers, are better represented in Figure 4(a).


Figure 3 shows that when $\alpha = 0.1$, the entire Bradford curve is concave downward for
$10^2 < A < 10^4$. Figure 4(a) depicts the evolution of Bradford curves as the paper number $A$
increases from $10^2$ to $10^4$, aligning well with theoretical predictions.


Figure 4(b) presents the simulation and analytical results for $T_0$, $A_0$ and $X_1$. Hollow symbols
represent the simulation results, while solid symbols denote the analytical ones. It can be observed
that these three key factors are linear functions of the paper number $A$ on a log-log scale, as
expressed by Equation (18):


$$ \log(Y) = a_\rho + b_\rho \log(A) \qquad (18) $$


where $Y$ stands for any of $T_0$, $A_0$ and $X_1$, and the constants $a_\rho$ and $b_\rho$ are
functions of $\rho$. Since $\rho$ is approximately one, $a_\rho$ and $b_\rho$ can be considered
constants. The close match between analytical and numerical results confirms the validity of
Equations (3), (4), and (6). The analytical result for $T_0$ is slightly lower than the numerical ones
because, in the numerical results, all journals with only one relevant paper are
considered part of the core region, whereas, in theory, some of them belong to the normal region.
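
Equation (18) is an ordinary least-squares line in log-log coordinates, so its coefficients can be estimated with a few sums. The following sketch uses natural logarithms throughout; the function names are illustrative, and any consistent logarithm base would give the same predictions.

```python
from math import exp, log

def fit_loglog(A_values, Y_values):
    """Least-squares estimates (a_rho, b_rho) of log(Y) = a_rho + b_rho * log(A)."""
    xs = [log(a) for a in A_values]
    ys = [log(y) for y in Y_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b_rho = sxy / sxx
    a_rho = mean_y - b_rho * mean_x
    return a_rho, b_rho

def predict(a_rho, b_rho, A):
    """Predicted parameter value Y at article number A, from Eq. (18)."""
    return exp(a_rho + b_rho * log(A))
```

The same routine would be applied separately to each of $T_0$, $A_0$ and $X_1$, giving one pair $(a_\rho, b_\rho)$ per parameter.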


[Figure 4 panels: (a) accumulated journal productivity $R(r)$ versus journal rank $r$, showing the
normal zone, the core zone and the key points $(1, X_1)$, $(T_0, A_0)$ and $(T, A)$; (b) parameter
value $Y$ versus article number $A$, showing numerical, fitted and analytical results for $T_0$,
$A_0$ and $X_1$.]


Figure 4 The dynamics of the Bradford curve and the variation of key parameters when $\alpha = 0.15$:
(a) the dynamics of the Bradford curves; (b) the variations of key parameters.


3 Numerical Study 
3.1 Decreasing Entry Rate 
Assume that the probability of adding a new journal decreases linearly over time:


$$ \alpha(t) = \alpha_s - kt \qquad (19) $$


where $k$ is a constant, $k = (\alpha_s - \alpha_f)/A_f$, $\alpha_s$ is the entry rate of new journals
in the initial state, and $\alpha_f$ and $A_f$ are the entry rate of new journals and the total number
of articles in the final state, respectively. The accumulated number of journals can then be
expressed as:


$$ T = \sum_{t=1}^{A} \alpha(t) \approx \alpha_s A - \frac{1}{2} k A^2 \qquad (20) $$


Using Equation (20), a quadratic fit of $T$ against $A$ determines the values of $\alpha_s$ and
$\alpha_f$. Once these are known, the average entry rate $\bar{\alpha} = (\alpha_s + \alpha_f)/2$ can
be used to calculate the analytical results.
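
The quadratic fit of Equation (20) can be sketched as a two-parameter least-squares problem without an intercept, $T = c_1 A + c_2 A^2$, with $\alpha_s = c_1$ and $k = -2 c_2$. The code below solves the normal equations directly; it assumes the last entry of `A_values` is the final article number $A_f$, and the function name is illustrative.

```python
def fit_entry_rate(A_values, T_values):
    """Fit T = c1*A + c2*A**2 and return (alpha_s, alpha_f, k) per Eqs. (19)-(20)."""
    s2 = sum(a ** 2 for a in A_values)
    s3 = sum(a ** 3 for a in A_values)
    s4 = sum(a ** 4 for a in A_values)
    t1 = sum(a * t for a, t in zip(A_values, T_values))
    t2 = sum(a ** 2 * t for a, t in zip(A_values, T_values))
    det = s2 * s4 - s3 ** 2           # determinant of the 2x2 normal equations
    c1 = (t1 * s4 - t2 * s3) / det
    c2 = (s2 * t2 - s3 * t1) / det
    alpha_s = c1                      # initial entry rate
    k = -2.0 * c2                     # slope of the linear decline
    alpha_f = alpha_s - k * A_values[-1]
    return alpha_s, alpha_f, k
```

The average entry rate then follows as `(alpha_s + alpha_f) / 2`.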


Figure 5 shows the dynamic evolution of Bradford curves and the variations of key parameters
when the entry rate decreases linearly from 0.2 to 0.1. The proposed method effectively predicts
these variations, with analytical results using $\bar{\alpha}$ matching the simulation results well.
The numerical results for $A_0$ and $X_1$ are slightly lower than the analytical ones, which
suggests that a decreasing entry rate has a slight negative impact on their increase. However, the
effect on the overall shape of the Bradford curve and on the key parameters is relatively
insignificant, indicating that the analytical results for a constant entry rate $\bar{\alpha}$ can
still be used to predict key parameters without significant errors.

Figure 5 The dynamics of the Bradford curve and the variation of key parameters when $\alpha$
decreases linearly from 0.2 to 0.1: (a) the dynamics of the Bradford curves; (b) the variations of
key parameters.


3.2 Aging Rate of Journals 


Simon assumes that only one paper gets published in each time period, and models the
probability of a journal increasing in paper number as proportional to a weighted sum of its past
increments. These increments are weighted by a factor that decreases geometrically over time, with
the rate of decrease denoted as $\gamma$.


Let $y_j(k)$ represent the change in the paper number of the $j$-th journal during the $k$-th time
interval, where $y_j(k)$ is either 1 (indicating a unit increment) or 0 (indicating no change). The
paper number of the $j$-th journal at the end of the $k$-th interval is given by
$\sum_{\tau=1}^{k} y_j(\tau)$. The expected increment in paper number during the $(k+1)$-th interval is:


$$ p[y_j(k+1) = 1] = \frac{1}{W_k} \sum_{\tau=1}^{k} y_j(\tau)\, \gamma^{k-\tau} \qquad (21) $$


where $W_k$ is a time-dependent function that is the same for all journals, defined as
$W_k = \sum_j w_j(k)$ with $w_j(k) = \sum_{\tau=1}^{k} y_j(\tau)\, \gamma^{k-\tau}$. The parameter
$\gamma$ determines how quickly the influence of past growth diminishes and is thus referred to as
the aging rate of journals in this paper.
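
A single run of the aging model can be sketched by keeping one discounted weight $w_j(k)$ per journal and updating it as $w_j(k+1) = \gamma\, w_j(k) + y_j(k+1)$. The sketch below is an assumption-laden illustration rather than the exact procedure of Ijiri & Simon (1977): it combines Equation (21) with a constant entry rate $\alpha$ for new journals, and the function and parameter names are hypothetical.

```python
import random

def simulate_with_aging(num_papers, alpha, aging_rate, seed=None):
    """Return per-journal paper counts for a Simon-Yule process with aging."""
    rng = random.Random(seed)
    counts = [1]       # papers per journal; the first paper founds journal 0
    weights = [1.0]    # w_j(k): geometrically discounted sum of past increments
    for _ in range(num_papers - 1):
        is_new = rng.random() < alpha
        if not is_new:
            # Established journals are chosen with probability w_j(k) / W_k.
            j = rng.choices(range(len(weights)), weights=weights)[0]
        # Every past increment ages by one interval.
        weights = [w * aging_rate for w in weights]
        if is_new:
            counts.append(1)
            weights.append(1.0)
        else:
            counts[j] += 1
            weights[j] += 1.0
    return counts

counts_aging = simulate_with_aging(2000, 0.15, 0.95, seed=2)
```

Setting `aging_rate` to 1.0 makes every weight equal to the journal's paper count, which recovers the model without aging.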


Figure 6 illustrates the impact of the aging rate of journals $\gamma$ on the dynamics of Bradford
curves and key parameters. It shows that the aging factor increases $T_0$, making the normal region
more concave downward. The aging effect also markedly reduces $X_1$ by weakening the Matthew
effect, whereby successful journals attract more papers. As older journals lose their appeal, this
"success breeds success" effect diminishes, leading to a substantial decrease in $X_1$, as shown in
Figure 6(a). Consequently, while the number of articles $A_0$ in the core zone remains relatively
unchanged, $T_0$ must increase to offset the reduction in $X_1$. This expansion of the core region
increases its share of the Bradford curve and, through the reduction of $k$ and $X_1$, shapes the
core region into a J-shape, as discussed in Section 2.3. Additionally, as $T_0$ increases, it becomes
more likely that $T_0$ will exceed $1/b$, further contributing to the concave downward shape of the
normal region. In summary, the aging of journals makes it easier for the Bradford curve to adopt an
S-shape.

Figure 6 The dynamics of the Bradford curve and the variation of key parameters when $\alpha = 0.15$
and $\gamma = 0.95$: (a) the dynamics of the Bradford curves; (b) the variations of key parameters.


3.3 Varying Entry and Aging Rates 


Figure 7 shows the impact of varying entry and aging rates on Bradford curves. In real-world
scenarios, both the entry rate and the aging rate often change steadily. For example, the entry rate
might decrease linearly from 0.2 to 0.1 while the aging rate increases linearly from 0.95 to 1.0; the
simulation results for this case are shown in Figure 7. Comparing Figures 5, 6, and 7 reveals that the
effects of a decreasing entry rate and an increasing aging rate are similar to those observed with
constant entry and aging rates (Figure 6). However, in Figure 7(b), both the article number $A_0$ and
the journal number $T_0$ of the core region are even lower than in Figure 6(b), indicating that the
decreasing entry rate further exacerbates their decrease. Consequently, the normal region of the
Bradford curve becomes less concave downward, and its starting point on the y-axis is notably lower.
Importantly, all three key factors continue to show linear relationships with the article number on a
log-log scale, suggesting that Equation (18) remains useful for predicting them.

Figure 7 The dynamics of the Bradford curve and the variation of key parameters when $\alpha$
decreases linearly from 0.2 to 0.1 and $\gamma$ increases linearly from 0.95 to 1.0: (a) the dynamics
of the Bradford curves; (b) the variations of key parameters.


4 Empirical Study 


4.1 Dataset of Croatian Chemistry Research 


Oluić-Vuković used the research output in chemistry by authors from Croatia to prepare full
bibliographic references for a ten-year period (Oluić-Vuković, 1992). This dataset includes only
articles published in journals, comprising 2,543 papers across 416 journals over a decade. The
productivity of the top few (fewer than 10) most prolific journals was taken directly from Figure 1
in (Oluić-Vuković, 1992), while the productivity of other journals was taken from Tables 4 and 6
in (Oluić-Vuković, 1998). Data from the figures were adjusted to match the total number of journals
and articles provided in Table 2 of (Oluić-Vuković, 1992).


To predict the dynamics of the Bradford curve, the first step is to predict the variation of the total
article number $A(t)$ over time $t$. A logistic growth curve (Verhulst, 1838) was fitted to the
empirical data to predict the total article number $A(t)$ at any given time, as shown in Figure 8(a).
Next, the total journal number $T$ and the entry rate of new journals $\alpha$ were estimated by
plotting the total journal number $T$ against the total article number $A$ and applying a linear fit
of $T = \alpha A$, as shown in Figure 8(b). Once the point $(T, A)$ is determined for any time, linear
regression of Equation (18) on log-log axes is used to determine the three key parameters $T_0$,
$A_0$ and $X_1$, with the fitting results shown in Figure 9(a). This allows the key points
$(T_0, A_0)$ and $(1, X_1)$ to be determined at any time. Finally, based on these three key points,
Equations (8) and (13) can be used to determine the Bradford curve for the core region and the
normal region, respectively, with results shown as red dashed lines and blue dotted lines in
Figure 9(b).

Figure 8 The process of determining the point $(T, A)$ for any given time: (a) the total article number
$A(t)$ as a function of time $t$; (b) the total journal number $T$ as a function of the article number $A$.

Figure 9 The procedures for predicting the evolution of the Bradford curve: (a) the variations of key
parameters $T_0$, $A_0$ and $X_1$ with the article number $A$; (b) the dynamics of the Bradford curves.


The Bradford curves shown in Figure 9(b) align closely with the empirical data, demonstrating 
a strong match between the predicted and observed outcomes. Notably, the Bradford curve 
transitions gradually from a J-shape to an S-shape, a transformation that is accurately captured by 
the analytical predictions. 


4.2 Dataset of Solar Power Research 


The bibliographies on solar power research for the years 1971, 1974, 1977, 1980, 1983, and 
1986, encompassing papers published in journals from the Engineering Index, were compiled by 
Garg et al. (Garg et al., 1993). The data for this analysis were directly extracted from Tables 1-7 of 
the referenced study. 


As with the Croatian chemistry dataset, predicting the dynamics of the Bradford curve
involves the following four steps:


1. Predicting article number $A$: Fit a logistic growth curve to the empirical cumulative
article numbers $A(t)$ over time $t$ (from Table 7) to predict article numbers for the desired
intervals, as shown in Figure 10(a).


2. Predicting journal number $T$: Apply the quadratic fit of Equation (20) to the journal and
article pairs to estimate the journal number $T$ and the entry rate of new journals $\alpha$ for a
given article number $A$, as illustrated in Figure 10(b).


3. Predicting key parameters $T_0$, $A_0$ and $X_1$: Use the linear fit of Equation (18) on the
empirical data of $T_0$, $A_0$ and $X_1$ on log-log axes to obtain their predictions at any article
number $A$, as depicted in Figure 11(a).


4. Drawing the Bradford curve: Apply Equations (8) and (13) to draw the Bradford curve for
the core region and the normal region, respectively, as shown in Figure 11(b).

Figure 10 The process of determining the point $(T, A)$ for any given time: (a) the total article
number $A(t)$ as a function of time $t$; (b) the total journal number $T$ as a function of the article
number $A$.

Figure 11 The procedures for predicting the evolution of the Bradford curve: (a) the variations of key
parameters $T_0$, $A_0$ and $X_1$ with the article number $A$; (b) the dynamics of the Bradford curves.


Figures 9(b) and 11(b) demonstrate that while the proposed method can predict the general
trend of Bradford curves, there are inherent errors. These errors arise because Bradford's law
inherently contains uncertainties. Numerical studies reveal that the article numbers for each journal
rank have a large standard deviation, making it practically impossible to predict the precise shape
of the Bradford curve. Additionally, the various fitting procedures introduce errors into the process.
Therefore, this method can only predict the general trend of Bradford dynamics and cannot
accurately predict the article number for each journal at any given time.


Another issue with this method is that the first derivatives of the core region (Equation (14))
and the normal region (Equation (16)) differ at the point $(T_0, A_0)$, resulting in a non-smooth
analytical curve at the intersection point. In contrast, the numerical simulation results are smooth
throughout. This problem could be addressed by proposing more complex formulas for the normal
region, but this would complicate the overall method. Given the difficulty in accurately predicting
Bradford curve dynamics, this aspect is not explored further in this paper.


5 Conclusion 


This paper examines how integer constraints on the number of journals $T$ and articles $A$
affect the shape of the Bradford curve, dividing it into two distinct zones: the core zone and the
normal zone, based on the significance of these integer effects. Using the Simon-Yule model, we derive
analytical results for key parameters and distributions under a constant entry rate. Theoretical
formulas for each zone are developed, and the reasons behind the various shapes of Bradford
curves are analyzed. Monte Carlo simulations are employed to study the impact of decreasing entry
rates of new journals and aging rates of journals on the shape of the Bradford curve and key
parameters. Finally, we validate our proposed method using empirical data from the Croatian chemistry
and solar power research datasets. The main conclusions are:


1. The Bradford curve should be divided into two separate zones based on the significance of integer
constraints on journal and article numbers. Different formulas for each zone should be derived
separately.


2. The Bradford curve can exhibit four different shapes, determined by the second derivatives of the
core and normal zones.


3. The largest productivity $X_1$, the number of journals $T_0$, and the number of articles $A_0$ are
key parameters influencing the shapes of Bradford curves. Decreasing entry rates and aging
rates of journals affect these parameters.


4. The proposed four-step method can predict general trends in Bradford curves despite some
errors.


The conclusions of this paper can guide academic libraries in procuring and utilizing scientific
literature effectively.